Genotyping-by-sequencing targets genic regions and improves resolution of genome-wide association studies in autotetraploid potato

Key message De novo genotyping in potato using methylation-sensitive GBS discovers SNPs largely confined to genic or gene-associated regions and displays enhanced effectiveness in estimating LD decay rates, population structure and detecting GWAS associations over ‘fixed’ SNP genotyping platform. Study also reports the genetic architectures including robust sequence-tagged marker–trait associations for sixteen important potato traits potentially carrying higher transferability across a wider range of germplasm. Abstract This study deploys recent advancements in polyploid analytical approaches to perform complex trait analyses in cultivated tetraploid potato. The study employs a ‘fixed’ SNP Infinium array platform and a ‘flexible and open’ genome complexity reduction-based sequencing method (GBS, genotyping-by-sequencing) to perform genome-wide association studies (GWAS) for several key potato traits including the assessment of population structure and linkage disequilibrium (LD) in the studied population. GBS SNPs discovered here were largely confined (~ 90%) to genic or gene-associated regions of the genome demonstrating the utility of using a methylation-sensitive restriction enzyme (PstI) for library construction. As compared to Infinium array SNPs, GBS SNPs displayed enhanced effectiveness in estimating LD decay rates and discriminating population subgroups. GWAS using a combined set of 30,363 SNPs identified 189 unique QTL marker–trait associations (QTL-MTAs) covering all studied traits. The majority of the QTL-MTAs were from GBS SNPs potentially illustrating the effectiveness of marker-dense de novo genotyping platforms in overcoming ascertainment bias and providing a more accurate correction for different levels of relatedness in GWAS models. GWAS also detected QTL ‘hotspots’ for several traits at previously known as well as newly identified genomic locations. Due to the current study exploiting genome-wide genotyping and de novo SNP discovery simultaneously on a large tetraploid panel representing a greater diversity of the cultivated potato gene pool, the reported sequence-tagged MTAs are likely to have higher transferability across a wider range of potato germplasm and increased utility for expediting genomics-assisted breeding for the several complex traits studied. Supplementary Information The online version contains supplementary material available at 10.1007/s00122-024-04651-8.


Introduction
Potato (Solanum tuberosum L.) is one of the world's most economically important food crops and holds major significance for future global food security.Despite its importance and intensive breeding efforts, genetic gains for yield and other complex traits have only been slowly accumulated, and old potato varieties are still widely used in commerce (Bachem et al. 2019;Douches et al. 1996;Piepho et al. 2014).Barriers to improvement are largely due to the complexities caused by autotetraploidy and a highly heterozygous potato genome.Constant progress in developing new and improved potato cultivars must be achieved for meeting the challenges posed by emerging disease threats, unpredictable and accelerating climatic changes, and rising demands for niche products due to expansion of markets in the fresh and the processing sectors.However, only a limited fraction of the available genetic diversity has so far been exploited for pursuing breeding objectives.Concerted research efforts are needed to screen the available genetic diversity for identifying novel and beneficial trait alleles targeting the manifold challenges faced by this key crop plant.Moreover, in order to realize the true potential of genomics-assisted breeding in potato the availability of cost-effective high-throughput genotyping methodologies is essential.The advent of nextgeneration sequencing (NGS) and high-throughput genotyping platforms coupled with the availability of whole-genome sequence data has brought a step change in the way potato genetic studies are conducted.Sequence-based genotyping (SBG) methods are now the preferred approach for highthroughput low-cost genotyping as compared to previously deployed SNP (single-nucleotide polymorphisms) array technologies due to the several advantages they provide.These include reduced cost, greater flexibility and scalability, and the capability to scan polymorphisms without prior knowledge, characteristic of an 'open system' genotyping platform (Sharma and Bryan 2017).
Although next-generation sequencing costs continue to decline, deploying whole-genome sequencing (WGS) for routine genotyping applications is still cost prohibitive, and not usually required for performing genetic analyses such as linkage mapping, genome-wide association studies (GWAS) and genomic selection.A more cost-effective form of SBG employs construction of NGS libraries using genome complexity reduction, and reduced representation mainly achieved through sequence capture or restriction enzyme-based methods.This facilitates interrogation of the same subset of the genome across the target population in a reproducible manner.Many variants of the restriction enzyme-based SBG technique exist, the two most commonly used are restriction-site associated DNA sequencing (RAD-Seq) (Baird et al. 2008) and genotyping-by-sequencing (GBS) (Elshire et al. 2011).GBS does not require DNA size fractionation and, thus, offers a simplified NGS library preparation procedure over RAD-seq.Restriction enzymebased SBG provides several advantages viz.simultaneous marker discovery and genotyping; complexity reduction is easy, rapid, specific and highly reproducible; highly multiplexed and less expensive; reduced sample handling; and few PCR and purification steps (Elshire et al. 2011).However, use of GBS does raise certain issues which need careful consideration.GBS can be prone to ascertainment bias due to differential methylation of the restriction sites if methylation-sensitive restriction enzymes are used.This necessitates collection of experimental material from plants at synchronized growth stages and from similar tissue types to avoid variation arising due to differential methylation patterns.As any other NGS technique, GBS is also affected by potentially large amounts of missing data which can occur due to presence/absence variation, polymorphic/methylated restriction site, library complexity and poor read coverage in the affected regions.
GBS has been exploited in several crop plant species (Furuta et al. 2017;Labate et al. 2020;Medina et al. 2020;Nguyen et al. 2020;Poland et al. 2012;Rabbi et al. 2014;Salinas-Aponte et al. 2014;Willman et al. 2022;Zhebentyayeva et al. 2019) to perform genetic studies including GWAS, and reports in potato are also increasing in number (Bastien et al. 2018;Byrne et al. 2020;Diaz et al. 2021;Sverrisdottir et al. 2017).GWAS offer considerable advantages over use of biparental populations, such as enhanced mapping resolution, a larger number of traits that can be assessed in a single study, as well as increased robustness and transferability of GWAS QTL across a wider germplasm pool.Combining modern genetic methodologies such as GWAS with SBG approaches provide unprecedented advances in trait QTL detection and transitioning towards 'next-generation' potato breeding.Numerous studies have successfully reported application of GWAS in potato (Berdugo-Cely et al. 2017;Carley et al. 2017;D'hoop et al. 2008, 2014;Fischer et al. 2013;Kloosterman et al. 2013;Lindqvist-Kreuze et al. 2014;Rosyara et al. 2016;Schönhals et al. 2016;Sharma et al. 2018;van Eck et al. 2017;Vos et al. 2017).Reports involving other SBG approaches in potato, especially for low cost genome scanning, are also emerging (Leyva-Perez et al. 2022).
The current study reports the development of a robust GBS procedure in potato and demonstrates its efficacy in QTL detection using a diverse potato association panel.Sharma et al. (2018) used this panel of 341 clones to evaluate different GWAS models in tetraploid potato using the Infinium 8k Potato SNP Array (Felcher et al. 2012;Hamilton et al. 2011).Here, we report the results of a GWAS analysis on sixteen key tuber quality and production traits using GBS and Infinium SNPs including assessments of genetic diversity, population structure, and linkage disequilibrium.We also compare the results from the two contrasting genotyping approaches involving fixed (used by Sharma et al. (2018)) and flexible (adopted in the current study) genotyping platforms.

Germplasm, field trials, phenotyping and trait data analyses
Field trials and phenotyping of the GWAS panel were conducted in 2012 and 2013 at two different sites (Cambridge and York, UK) and are described in detail in Sharma et al. (2018).The GWAS panel comprised 341 diverse tetraploid genotypes, including a set of 57 advanced breeding lines (Supplementary Table S1).The panel largely comprised European founder and cultivated germplasm, but it also contained 29 non-European cultivars.This paper reports findings for the following traits measured on this panel in replicated trials (two replicates for each environment), but not all traits were phenotyped in every environment (Supplementary Table S2): after cooking blackening (ACB: For each observation unit, five tubers steamed for 20 min, cut in half and each half assessed for darkening after 30 min; 1-9, 1 severe); average tuber weight (ATW); black dot (BLK); black scurf (BLS); tuber skin brightness (BRT: tuber skin texture; 1-9, 9 very smooth); common scab (CSC) using ADAS Key No. 2.3.1 (Anonymous 1976); tuber dry matter (DRM: percentage dry matter based on air-dried and water-immersed weights); tuber eye depth (EYE: 1-9; 1 deep, 9 shallow); tuber flesh colour (FSH: 1-9; 1 white, 9 deep yellow); height breadth ratio (HBR: measure of canopy architecture); tuber shape (SHP: 1-6, 1 round); tuber sprouting (SPR); tubers per stem (TPS); total tubers per plant (TTU); tuber shape uniformity (UNI: 1-9, 9 very uniform); tuber yield (YLD: mean kg/ plot).Plant and stem counts were performed after full emergence and before canopy closure.For stem counts, branches from main stems were excluded and measurements were taken before the development of secondary stems.For HBR, foliage height (measured from the top of the ridge to the top of the canopy) and breadth (measured across the width of the canopy) measurements were taken when plants reached approximately 80% ground cover but before canopy closure.For DRM, ten tubers (55-65 mm diameter) were gently washed to remove soil and air-dried followed by percentage dry matter calculations (Weltech PW-2050, Weltech International Ltd, UK) using tuber weight measurements in air and water.SPR observations were using tubers (55-65 mm diameter) stored at 10 °C without any sprout suppressants.Data for the longest tuber sprout was recorded eight weeks after 95% of panel tubers have broken dormancy (i.e.sprout length > 3 mm).Each observation unit comprised sprout data from 6 tubers.Tuber storage and sprout measurements were conducted at the Sutton Bridge Crop Storage Research (SBCSR) facility, AHDB, UK.For BLS and CSC, the percentage of surface area affected by these diseases was recorded in 7 categories viz., 0-1, > 1-5, > 5-10, > 10-25, > 25-50, > 50-75 and > 75-100.A severity score for each plot was calculated by multiplying the number of tubers in each category by the mid-point value and dividing the sum of these values by the total number of tubers assessed.For BLK a similar approach was adopted but the percentage of the affected surface area was scored in 6 categories viz., 0-1, > 1-10, > 10-25, > 25-50, > 50-75 and > 75-100.To generate mean phenotypic values for each trait, genotype was modelled as a fixed effect while all other effects were treated as random.For each environment, the best linear unbiased estimates (BLUEs) for all traits were calculated using REML implemented in Genstat 20th edition (VSN International Limited, http:// www.vsni.co.uk).These trait BLUEs were used in GWAS as well as all other analyses involving phenotypic values.Trait pairwise Pearson correlation coefficients were calculated using R package ggstatsplot (Patil 2021) deploying "Bonferroni" adjustment method for p values for multiple comparisons and significance level threshold set to 5%.Trait broad-sense heritabilities (H 2 ) on plot basis were estimated using R package StageWise (Endelman 2023).

Construction of genotyping-by-sequencing libraries and sequencing
Genomic DNA was extracted from young leaf tissue from individual field grown plants using the Qiagen DNeasy Plant Maxi Kit (Qiagen) and quantified using the Quant-iTTM PicoGreen® dsDNA Assay Kit (Invitrogen, San Diego, CA).Previously, the GWAS panel was genotyped using the Infinium 8k Potato SNP Array (Felcher et al. 2012;Hamilton et al. 2011) as detailed in Sharma et al. (2018).The current study extends the genetic characterization of the same GWAS panel using a genotyping-by-sequencing (GBS) procedure adapted from Poland et al. (2012).Different restriction enzyme (RE) combinations (PstI, NsiI and SbfI as rare cutters; MseI and BfaI as frequent cutters) were evaluated for constructing GBS libraries using potato cv.Desiree as a test sample.Actual fragment size distribution from test GBS libraries, in combination with in silico restriction digestion prediction using RestrictionDigest tool (Wang et al. 2016), was assessed to select the optimum restriction enzyme pair for library construction.For constructing final GBS libraries, DNA samples (100 ng each) from the GWAS panel were subjected to double restriction enzyme digestion using the optimum RE pair (PstI-MseI, selection procedure described under "Results" section) followed by adapter ligation in a 24-plex multiplexing format.PstI adapters were barcoded (designed using Deena Bioinformatics, Netherlands) whereas MseI adapters were used as common adapters for all samples (Supplementary Table S3).The processed samples were pooled and PCR-amplified to obtain an NGSready library.Fragment size analysis and additional quality checks on GBS libraries were performed using the 2200 TapeStation system (Agilent Technologies).Each finished GBS library (16 in total) was sequenced on a single lane of Illumina HiSeq 2500 platform to generate 150 bp singleend sequence reads.Illumina sequencing was performed at Edinburgh Genomics, University of Edinburgh, UK.

Read mapping, variant detection and genotype calling
Sequence data from all 16 GBS libraries were deconvoluted into single sample reads using GBS-SNP-CROP-v.4.1 (Melo et al. 2016;Melo and Hale 2019).Processed reads from each accession were quality trimmed using Trimmomatic (Bolger et al. 2014) and mapped onto the potato reference (the doubled monoploid potato S. tuberosum Group Phureja DM 1-3 516 R44; hereafter referred to as DM) genome version 6.1 (Pham et al. 2020; Potato Genome Sequencing Consortium 2011) using Bowtie2 (Langmead and Salzberg 2012) followed by variant discovery using HaplotypeCaller and performing joint genotyping across all samples simultaneously using GenotypeGVCF programmes available through GATK (DePristo et al. 2011;McKenna et al. 2010).Variants were filtered for genotype level read depth (DP > = 15), genotype quality (GQ > = 10) and QualByDepth (QD > = 2) , i.e. the QUAL score normalized by unfiltered read depth of variant samples.This was implemented using GATK VariantFiltration and SelectVariants tools.Functional annotation of GBS SNPs was performed using SNPeff (Cingolani et al. 2012).

Infinium SNP array data analysis
Infinium 8k Potato SNP Array (Felcher et al. 2012;Hamilton et al. 2011) genotyping data, previously reported by Sharma et al. (2018), was reanalysed.SNP allelic dosages (genotypes) were called using fitPoly R package which was able to resolve genotypic classes for 6401 SNPs.SNP genomic coordinates were annotated according to DM assembly v6.1 (Pham et al. 2020).

Genetic diversity and population structure
Population structure in the diversity panel was examined using two clustering methodologies viz., principal components analysis (PCA) and k-means clustering.PCA was performed over genomic relationship matrix implemented in the R package ASRgenomics and scree plot from the PCA analysis was used to examine germplasm diversity and infer the number of subpopulations (Q) in the association panel.K-means clustering was performed using 'find.clusters'function in adegenet (Jombart 2008;Jombart and Ahmed 2011) to identify the number of genotypic groups.The optimal number of k-means [equalling number of subpopulations (Q)] was determined by using the Bayesian information criterion (BIC) as a statistical measure of goodness of fit.The relationship between population groups was also assessed using DAPC (Discriminant analysis of Principal Components).The densities of individuals were plotted as a one-axis density plot (single discriminant function) to visualize the level of discrimination among population groups.A dendrogram heatmap of the genomic relationship matrix was used to visualize the genetic kinship among the genotypes included in the panel.

Genome-wide association analysis
GWAS were performed using GWASpoly version 2.10 ( Rosyara et al. 2016) employing additive, general, simplex dominance, duplex dominance, diplo-general and diplo-additive gene (genetic) action models as described in GWASpoly manual.Each genetic effect model was further evaluated using four different statistical models viz.(1) Naïve model, without controlling any confounding effects, (2) Kinship model, controlling just for individual relatedness (K), (3) population Structure model, controlling for population structure (Q) effects only, and (4) full model, accounting for K as well as Q confounding effects.All these four models are hereafter referred to as Naïve, K, Q and QK models, respectively.Within each genetic model, fitness of different statistical models was evaluated using Quantile-Quantile (Q-Q) plots of the observed versus expected − log 10 (p) values which should follow a uniform distribution under the null hypothesis.Models were ranked using genomic control inflation factor (λ GC ) metric calculated as the median of the resulting Chi-squared test statistics divided by the expected median of the Chi-squared distribution.The LD-based Bonferroni-type multiple testing correction method "M.eff" (with genome-wide α = 0.05) was applied for establishing a p value (− log 10 (p)) detection threshold for statistical significance.The "M.eff" method in GWASpoly implements a Bonferroni-type correction using an effective number of markers that accounts for LD between markers (Moskvina and Schmidt 2008).For selecting a single marker per QTL, markers with scores above the set threshold for statistical significance were filtered to obtain the most significant marker-trait association (MTA) within a specified window set using the average extent of LD decay in the panel.

Estimation of linkage disequilibrium (LD) decay
Whole chromosome-scale LD was estimated following the procedure as previously described (Sharma et al. 2018;Vos et al. 2017).Briefly, Pearson correlation coefficient (r 2 ) was used to calculate correlations between marker-pairs using SNP dosage scores (0-4) and LD was estimated based on marker pairs located within each of the 12 individual chromosomes.Extent of LD decay was estimated by implementing quantile regression (R package 'quantreg'; Koenker 2017) on the 90th percentile.From the fitted regression, two LD estimators were obtained, viz.LD 1/2max,90 and LD 1/10,90 , denoting the distances (in Mb) at which LD equals one-half of its maximum fitted r 2 value (r 2 max,90 ) and where r 2 reaches one-tenth on the 90th percentile, respectively.LD estimates, for both datasets (GBS and Infinium SNP array), were computed with SNPs used for performing GWAS.

Optimization of restriction enzyme combination for constructing GBS libraries
A total of six restriction enzyme (RE) pairs, comprising one 'rare' cutter (PstI: methylation-sensitive 6-base cutter, NsiI: 6-base cutter, or SbfI: 8-base cutter) and one 'frequent' cutter (MseI: 4-base cutter or BfaI: 4-base cutter), were examined for their suitability to construct potato GBS libraries.The fragment size distribution from test GBS libraries obtained using these six RE pairs as well as that predicted from in silico restriction digestion is illustrated in Fig. 1a, b.The number of fragments (GBS-tags) generated using NsiI combination pairs was relatively high especially in the smaller fragment-size range (0-100 bp).The overall number of fragments generated using SbfI combination pairs was low considering it being an 8-base cutter.As compared to the former two REs, number and distribution of GBS-tags generated using PstI combination pairs was found to be in the optimum fragment size and number range for constructing GBS libraries.The propensity of GBS tags in the size-range beyond that is suitable (i.e.200-500 bp, excluding adapters (Quail et al. 2009)) for effective cluster amplification during standard Illumina short-read sequencing was higher in libraries generated using BfaI in combination with all three rare cutters as compared to those obtained using MseI.Considering the number and fragment-size distribution obtained from test libraries, a combination of 'PstI/MseI' restriction enzymes was chosen to construct final GBS libraries.A 24-plex multiplexing regime was followed requiring the generation of sixteen GBS libraries containing 361 tetraploid clones (341 net samples, 15 Desiree controls and 5 samples with identity issues later removed from the study), 22 diploid clones (including 16 DM control samples) and one hexaploid genotype (Supplementary Table S4).Figure 1c shows fragment size distribution for one of these 'Illumina sequencing ready' 24-plex GBS libraries constructed here.

Genome-wide SNP discovery using genotyping-by-sequencing
Sequencing on an Illumina HiSeq 2500 yielded ~ 1889 million reads in total averaging 118 million reads per GBS library.Sample deconvolution using GBS-SNP-CROPv.4.1 (Melo et al. 2016;Melo and Hale 2019) S5).Using the read mapping and variant discovery procedure, as described in the "Materials and methods" section, a total of 210,614 polymorphisms (204,397 SNPs and 6217 indels) were initially identified across all samples included in the GBS panel.Of these variants, 192,282 SNPs were identified for the 341-clone tetraploid set (hereafter referred to as 'total-tetra-set') and 63,743 SNPs were obtained in the non-tetraploid (2× and 6×) genotype set (not part of the current study) indicating an average of one GBS SNP per ~ 3.9 kb and ~ 11.6 kb for both germplasm panels, respectively.The genomic locations of 'total-tetra-set' SNPs intersected 24,181 genes (DMv6.1, data not shown).
Filtering of 'total-tetra-set' markers for low sample genotype level read depth (DP < 15), low genotype quality (GQ < 10), low quality-by-depth (QD < 2.0), variant type (only biallelic SNPs) and excluding non-variants, yielded 116,048 high-quality biallelic single-nucleotide variants (SNVs).Of these, 99.95% SNPs (115,985) were physically mapped across 12 potato chromosomes with an average of one SNP per ~ 6.4 kb.The highest and lowest marker rates were observed on chromosome 2 and chromosome 10 with an average of one SNP per ~ 4.0 kb and ~ 8.8 kb, respectively.The remaining 63 SNPs were physically mapped on unanchored superscaffolds (chromosome 0) of the potato genome.Figure 2 displays genome-wide SNP density plot and the distribution of SNPs by chromosomes is given in Table 1.The panel showed a Ts/Tv (Transitions/Transversions) ratio of 1.86 for the observed SNPs while the missense-to-silent ratio for 'total-tetra-set' SNPs was 0.85 (missense: 45.86%; silent: 53.74%; nonsense: 0.41%).The percentage of SNPs with high, low, moderate and modifier impact categories was 0.18, 15.22, 12.11 and 72.50, respectively.The filtered 'total-tetra-set' SNPs overlapped 19,998 genes, and the list of these genes along with SNP count affecting each gene and their respective impact categories is provided in Supplementary Table S6.The number of SNPs in different effect-type categories and regions is provided in Table 2.

Genetic characterization and population structure analysis
The GWAS panel (pheno-tetra-set) was genetically characterized using SNPs detected by GBS in the current study as well as using the Infinium 8k Potato SNP Array (Felcher et al. 2012;Hamilton et al. 2011) previously reported by Sharma et al. (2018), as described in the "Materials and methods" section.Processing of Infinium array SNPs using the filtering criteria as described in the preceding section for GBS variants provided an effective set of 5845 SNPs. Figure 3 displays pattern of distribution and marker density for 24,518 GBS and 5845 Infinium SNPs used for all subsequent analyses.Population structure was analysed using principal component analysis (PCA) and K-means clustering.Assessment of the scree plot (point of inflection/elbow junction) from the PCA analysis indicated the presence of three subpopulations (Q) within the association panel (Supplementary Fig. S1).This was further corroborated by the cluster detection based on Bayesian information criterion (BIC) which supported the presence of three clusters as 'k = 3' was within  the shallow minimum of the BIC 'goodness of fit' curve (Supplementary Fig. S2) indicating three clusters are most optimum for classifying genotypes in the panel.The pattern of population stratification and the percentage of the genetic variation accounted by the first three components of the PCA using GBS and Infinium array SNPs is illustrated in Fig. 4. The relationship between subpopulation groups was also assessed using DAPC (Discriminant analysis of Principal Components DAPC).Visualizing densities of individuals on a single-axis density plot (single discriminant function) revealed enhanced level of discrimination among the three subpopulation groups by using GBS SNPs than that achieved by the Infinium SNPs (Supplementary Fig. S3).
The K-means-based cluster (subpopulations) membership for the GWAS panel clones along with their market segment, year of release and country of origin details is provided in Supplementary Table S7.The panel subpopulations did not display any correspondence with clone market segment categories (Supplementary Table S7).However, as reported by Sharma et al. (2018), 'country of origin' did show some association with PCA subpopulations where cluster 1 predominantly contain modern UK cultivars while the other two clusters include old and modern cultivars from the UK, European and non-European countries.The genetic kinship among the genotypes included in the GWAS panel was visualized as a dendrogram heatmap of the genomic relationship matrix (Fig. 5). Figure 6 displays histogram plots for the diagonal and off-diagonal values of the genomic relationship matrix using GBS and Infinium SNPs.The diagonal histogram plot for both SNP datasets is centred around one (GBS: 1.16; Infinium: 1.00) as expected for non-inbred individuals.Similarly, the off-diagonal histogram plot for both SNP datasets is centred around zero (GBS: − 0.0040; Infinium: − 0.0034) as presumed for the unrelated individuals.

Linkage disequilibrium analysis
Linkage disequilibrium in the GWAS panel was assessed using Pearson's r 2 statistic using pairwise combinations of SNPs present across all 12 chromosomes.The extent of LD decay was estimated at the whole chromosome level using two LD estimators, viz.LD 1/2max,90 and LD 1/10,90 , denoting the distances (in Mb) at which LD equals one-half of its maximum fitted r 2 value (r 2 max,90 ) and where r 2 reaches onetenth on the 90th percentile, respectively.The analysis was The extent of LD decay in the GWAS panel using both SNP datasets is illustrated in Fig. 7, and the r 2 max,90 , LD 1/2max,90 and LD 1/10,90 estimates from the two datasets are provided in Table 3.

Phenotyping
The association panel was phenotyped for sixteen production, agronomic and processing traits as described in the "Materials and methods" section.The GWAS panel displays a broad range of phenotypic diversity for most traits, and largely the residuals for all traits were normally distributed (Supplementary Fig. S4).As the panel largely contains  S7 (GBS: Cluster1 = red, Cluster2 = green, Cluster3 = cyan; Infinium: Cluster1 = green, Cluster2 = red, Cluster3 = cyan) cultivated potato varieties, for some commercially important traits the phenotypic scores are biased due to selection for the more favourable scores; nevertheless, the variation was sufficient for performing genetic analysis on these traits.
Average broad-sense heritability (H 2 , plot basis) estimates showed a wide variation (0.18-0.91), but for most traits the H 2 values were in the moderate-to-high range (0.4-0.8), with three (black scurf, after cooking blackening and tuber uniformity) and four (tuber shape, average tuber weight, tuber dry matter and tuber flesh colour) traits displaying low and very high heritabilities, respectively (Table 4).Significant correlations were observed between some traits, while correlations between most trait-pair combinations were nonsignificant (Fig. 8).

GWAS analyses
GWAS analyses utilized phenotypic and genotypic data from 'pheno-tetra-set' comprising 290 tetraploid clones.For each trait, GWAS were conducted using (a) only GBS SNPs (24,518), (b) only Infinium SNPs (5845), and (c) a combined (GBS and Infinium) marker set (30,363 SNPs); each GWAS-type hereafter referred to as GBS-GWAS, Infinium-GWAS and GBS-Infinium-GWAS, respectively.All GWAS analyses were performed using the autotetraploid genotype model (coding dosage level for alternate allele as 0, 1, 2, 3 or 4 for each biallelic SNP), and employing all six (additive, general, simplex dominance, duplex dominance, diplo-general and diplo-additive) gene action models as described in the R package GWASpoly manual.Each gene action model was further evaluated using Naïve, K, Q and QK models, and fitness of these statistical models was evaluated using Q-Q plots and genomic control inflation factor (λ GC ) metric as described in the "Materials and methods" section.First three principal components (PC) were deemed relevant for describing population structure based on the point of inflection observed in the PCA scree plot (Supplementary Fig. S1) and BIC 'cluster detection' goodness of fit curve (Supplementary Fig. S2); and were included to form the Q matrix in Q and QK GWAS models for controlling population confounding effects.The marker-trait associations (MTAs) were declared statistically significant on the basis of p value (− log10(p)) detection threshold established using the "M.eff" method (implemented in GWASpoly) to control the genome-wide false positive rate (α = 0.05).
For simplification and brevity, we focus here predominantly on results from the GBS-Infinium-GWAS analysis.Supplementary Fig. S5 displays Q-Q plots comparing the  S8 presents genomic control inflation factor (λ GC ) values for each trait and 'GWAS genetic model × statistical model' combination.Supplementary Fig. S6 illustrates individual trait Manhattan plots from all GWAS and gene action models analysed here while the combined Manhattan plots from K and QK models for all 16 traits and each gene action model are presented in Fig. 9.The initial GWAS scan resulted in a non-redundant set of 1181 significant MTAs from K and QK models (redundancy for same SNP MTAs appearing for more than one trait was not removed) covering all traits assessed in the study including MTAs appearing with multiple gene action models for the same SNP (Supplementary Table S9a).These initial MTAs were filtered for selecting the most significant marker per LD-based moving window scan and a single MTA per LD-based window (genomic interval) was retained (hereafter referred to as QTL-MTA) for each gene action model GWAS.The window-size (1.52 Mb) for LD-based scan was set using the average extent of LD decay (average LD 1/10,90 distance derived using GBS SNPs) in the panel.The most significant QTL-MTAs from K and QK models for each GWAS-type and trait are listed in Supplementary Table S9b, where all QTL-MTAs for SNPs appearing with more than one gene action model are included.Table 5 lists a nonredundant set of 189 unique QTL-MTAs where only a single QTL-MTA, for those appearing with multiple gene action models, with the highest significance value was retained and forms the basis of GWAS results presented in this study.Graphical representation of MTAs listed in Table 5 is presented in Fig. 10.

GBS targets genic regions and provides effective assessment of population structure and LD decay rates
A number of restriction-enzyme pairs were evaluated for constructing potato GBS libraries where the PstI-MseI pair was found to be most effective in terms of the number and distribution of GBS-tags in the optimum range for constructing GBS libraries.Furthermore, visualization of Fig. 7 Linkage disequilibrium (LD) measure r 2 in the potato association panel plotted versus the physical map distance (Mb) between pairs of SNPs (a, GBS SNPs; b, Infinium array SNPs) located on the whole chromosomal region for all 12 chromosomes.The trend line of the nonlinear quantile regression of r 2 (90th percentile) versus the physical map distance between the SNP markers is illustrated in red while dashed blue line depicts the standard LD decay threshold (r 2 = 0.1) ▸ the genome-wide distribution of the SNPs obtained from PstI-MseI GBS libraries revealed enrichment of GBS SNPs in the euchromatic regions and their avoidance in repetitive regions with the SNP distribution closely matching that of gene density (Fig. 2).The reported GBS method uses a methylation-sensitive restriction enzyme PstI as the 'rare' cutter, which avoids repetitive regions of the genome.Consequently, despite it being a non-targeted approach GBS-tags are mostly confined to the euchromatic fraction of the genome.Further genomic annotation using SNPeff (Cingolani et al. 2012) revealed 10.4% of GBS SNPs are located in intergenic regions, while 89.6% in genic (47.6%) or gene-associated (42%; 5 k bases up/downstream of gene boundaries) parts of the genome.
In accordance with the previous genetic studies (Dhoop et al. 2008;Sharma et al. 2018;Stich et al. 2013) citing a weak population structure in potato, although there was no clear visual separation of clones into distinct subpopulations, the genetic characterization of the potato panel using GBS and Infinium SNPs detected subtle grouping of the  2018).While the current study reports a similar trend of LD decay for chromosome 8 (LD 1/10,90 : 25.23 Mb) using Infinium array SNPs, the LD 1/10,90 measure for chromosome 8 using GBS SNPs is only 1.48 Mb which aligns well with the LD decay rates observed for the other chromosomes.The enhanced efficiency of GBS-SNPs in capturing LD decay rates could be attributed to the increased enrichment of GBS SNPs in the euchromatic region as well as more effective coverage of the heterochromatic parts of the genome as compared to Infinium SNPs (Fig. 3).Although the pattern (Fig. 3) and proportion (data not shown) of GBS SNPs in euchromatic and heterochromatic regions, as described by Sharma et al. (2013) and Leyva-Perez et al. ( 2022), for chromosome 8 did not significantly differ from rest of the chromosomes, the pronounced shift in LD decay rate for this chromosome using GBS SNPs is somewhat intriguing.

Complex interplay of traits in the potato panel
The association panel was phenotyped for sixteen production, agronomic and processing traits.Significant correlations were observed between some traits while correlations between most trait-pair combinations were non-significant (Fig. 8).For example, average tuber weight shows a strong negative correlation (− 0.68) with total tubers per plant, but a strong positive correlation (+ 0.63) with tuber yield.These observations are expected as larger tubers result in a greater yield, and fewer tubers per plant would result in less competition between tubers for photosynthates leading to an increase in tuber size.Tuber dry matter shows significant negative correlations with yield (− 0.23) and average tuber weight (− 0.34) which aligns well with the tendency for greater yields if tubers contain more water.Tubers per plant and tubers per stem show a positive strong correlation (+ 0.32) as expected.Tuber dry matter also shows a significant negative correlation with tuber skin brightness (− 0.33) which potentially could be an artefact of selection as varieties bred for fresh market need a better  skin finish to gain acceptance but for the processing sector, largely favouring varieties with higher dry matter, skin finish is far less important.A negative correlation (− 0.27) between tuber dry matter and tuber eye depth (EYE scoring: 1 deep, 9 shallow) is interesting and emphasizes the challenges for breeding varieties suitable for processing (i.e. with high dry matter content) while selecting for shallow eye depth.Tuber eye depth also shows a strong positive correlation with tuber shape (0.47; SHP: 1-6, 1 round), which in turn displays a significant negative correlation with tuber dry matter (− 0.24).The latter could also be affected by selection as higher dry matter and round tuber shape are amongst the highly desirable traits for the processing industry, mainly the potato crisp/ chip market segment.Tuber shape shows a significant correlation (− 0.42) with uniformity (UNI: 1-9, 9 very uniform).This potentially could be due to the observations that round tubers are more likely to retain uniformity in shape as they increase in size, however, the longer tubers will be more prone to display variation in tuber length because of the possibility of the presence of a number of smaller tubers which haven't yet fully elongated.After cooking blackening displays significant correlations with average tuber weight (0.22) and tuber yield (0.30) but a strong negative correlation with tuber dry matter (− 0.42; ACB scoring, 1 severe).Significant correlation between tuber bruising (enzymatic browning) and dry matter is previously reported (Urbany et al. 2011) and the ACB trait (non-enzymatic browning) assessed here seems to display similar patterns.These observations also align well with the negative correlation of tuber dry matter described above with tuber yield and average tuber weight.After cooking blackening also shows a significant negative correlation (− 0.26) with tuber flesh colour suggesting white tubers (low FSH value) have on average a lower tendency to darken after cooking (high ACB value).

GBS enhances marker saturation and QTL resolution for GWAS
GWAS were performed for all sixteen traits assessed in the study using a combined marker set of GBS and Infinium  5. Black dashed line depicts GWAS main significance threshold set to 1e-5 − log10(p) value, while actual significance thresholds obtained separately for each trait MTA using Bonferroni-type multiple testing correction method "M.eff" (with genome-wide α = 0.05) are provided in Table 5.The chromosome heatmaps below each plot illustrate SNP density (bin size = 1 Mb) SNPs.GWAS using individual SNP datasets (viz.GBS and Infinium) were also performed but results mainly using the combined SNP dataset are discussed here.As previously reported by Sharma et al. (2018), the performance of K and QK models was the most effective and generally in agreement with each other while the N model was least efficient amongst the four tested models in controlling population structure effects (Supplementary Table S8, Supplementary Fig. S5).For almost every trait, the QTL-MTA observed in K and QK models corroborated each other.However, in a few cases the results differed (i.e.QTL-MTA absent in either K or QK model) for one (tubers per stem, height breadth ratio, tuber skin brightness and tuber dry matter), two (common scab, after cooking blackening, black dot, black scurf and eye depth), four (tuber sprouting), five (tuber yield) and eleven (tuber flesh colour) QTL-MTAs.Most traits had at least a single QTL-MTA obtained from both marker sets, while QTL-MTAs for one trait (tuber uniformity) belonged to Infinium array and for four traits (average tuber weight, tuber dry matter, total tubers per plant and tuber yield) to GBS SNPs only.For 'black dot', GBS-Infinium-GWAS did not result in any MTAs possibly due to the increased significance threshold with the higher number of markers employed, and results from the GBS-GWAS and Infinium-GWAS are included.Although the described QTL-MTAs for 'black dot' (obtained using individual SNP dataset GWAS) did not achieve the set significance threshold using the combined SNP set analysis, they still showed the highest association strength in the reported genetic model categories for GBS-Infinium-GWAS.For the non-redundant set of 189 QTL-MTAs obtained from GBS-Infinium-GWAS, the majority of significant QTL-MTAs obtained were from GBS SNPs (42 Infinium vs. 147 GBS QTL-MTAs) highlighting their proximity to the causal variation and increased marker saturation and resolution achieved through GBS.The GWAS results for each of the traits studied are discussed in the light of previous published studies in more detail below:

After cooking blackening
After-cooking blackening (ACB) or darkening results from a non-enzymatic grey-black discoloration of the potato tubers after they are cooked using methods such as boiling, baking, frying and dehydration (Hughes and Swain 1962).ACB has a direct impact on the quality and marketability of potato products and is regarded as one of the most undesirable quality defects of potatoes.Previously, QTL for ACB have been reported on chromosomes 2, 4, 6, and 11 including an undefined homology group 'C' (Bradshaw et al. 2008) in a tetraploid biparental mapping population.However, the current GWAS for ACB revealed QTL-MTAs on all chromosomes except 4 and 11 (Table 5; Figs. 9, 10; Supplementary Fig. S6).The very poor resolution of the genetic map in the previous study, and also the lack of sequence-tagged markers prohibit detailed comparisons of the common QTL (chromosomes 2 and 6) identified in the two studies.However, the common QTL reported here appear to be in approximately the same locations as those detected in the earlier publication, that is towards the 'bottom' end of chromosome 2 and in the 'mid-to-bottom' region of chromosome 6.
Besides ACB, potato also exhibits other forms of tuber darkening, such as tuber bruising, resulting from enzymatic reactions mainly controlled by polyphenol oxidases (PPOs).Chi et al. (2014).A previous study (Urbany et al. 2011) has reported tuber bruising QTL on chromosome 8 without providing precise genomic location details, and a recent study by Angelin-Bonnet et al. (2023) also detected QTL for this trait on chromosomes 8 (~ 46 Mb, DMv4.03 (Sharma et al. 2013)) in close vicinity to the genomic location of tuber specific StPPOs (45.8 Mb;Chi et al. 2014) catalysing the production of melanin pigments that gives the bruising colour.Both studies also reported additional QTLs on chromosomes 2 and 3 (Urbany et al. 2011), and1 and7 (Angelin-Bonnet et al. 2023).The interval of chromosome 8 QTLs for ACB detected here is also located in the same region as tuber specific PPOs possibly with revised genomic coordinates (53-59 Mb) in the latest DM assembly version 6.1 (Pham et al. 2020) and in close vicinity to a PPO domain containing protein (48.2 Mb, DMv6.1).Moreover, one of the QTLs on chromosome 3 identified here is located close to the putatively annotated genes (4-coumarate:CoA ligase at 45.7 Mb and Glycogen/starch synthases, ADP-glucose type at 47.1 Mb) reported to be implicated in tuber bruising by Urbany et al. (2011).These observations reveal the cross-talk between pathways controlling both enzymatic and non-enzymatic discoloration of potato tubers and provide effective breeding targets for developing potatoes resistant to tuber darkening caused by multiple factors.

Black dot
Black dot, caused by Colletotrichum coccodes, was initially considered as a mild disease of potato, largely infecting weakened plants.However, in the last two decades the pathogen (C.coccodes) has been reported to infect roots and stems quite early in the growing season, causing early foliage death either by itself or in association with other pathogens, reducing plant and root growth as well as leading to reduced potato yields (Johnson et al. 2018).Furthermore, at the tuber phase the fungus infection causes unsightly potato blemishing, affecting tuber quality and marketability including significant water losses during storage, a factor receiving increased economic impact due to the changes in the marketing of fresh tubers, particularly with respect to the growing demand for washed potatoes (Massana-Codina et al. 2021).The GWAS for black dot performed here identified QTL-MTAs on chromosome 8 (Table 5; Figs. 9, 10; Supplementary Fig. S6).To the best of our knowledge, there have been no previous studies mapping this trait in potato, and the findings reported here could pave the way for gaining a basic understanding of genetic resistance to black dot with a view to developing black dot resistant cultivars.

Black scurf
Black scurf, caused by the basidiomycete Thanatephorus cucumeris (Rhizoctonia solani), refers to the formation of sclerotia on potato tubers (Carling et al. 1989).The disease causes substantial yield reductions ranging from 5 to 10% including economic losses due to significant decreases in quality especially in washed tubers with fine skin (Errampalli and Johnston 2001).To date, no genetic study has been conducted to identify genetic factors controlling black scurf in potato.The black scurf GWAS reported here detected significant QTL-MTAs on chromosomes 1,2,3,4,5,7,8,9,10 and 11 (Table 5;Figs. 9,10; Supplementary Fig. S6), suggesting a very complex genetic architecture of susceptibility to this economically important disease.

Common scab
Common scab, caused by Streptomyces species such as Streptomyces scabies, S. acidiscabies and S. turgidiscabies (Lambert and Loria 1989;Miyajima et al. 1998), is an unsightly blemish disease of potatoes occurring worldwide.Previous genetic mapping studies have identified common scab QTLs on several chromosomes viz., 1, 2, 3, 4, 6, 9, 11 and 12 (Bradshaw et al. 2008;Braun et al. 2017;Enciso-Rodriguez et al. 2018;Kaiser et al. 2020;Koizumi et al. 2021;Pereira et al. 2021a;Yuan et al. 2020) using diploid and tetraploid populations and deploying a range of marker types.GWAS here identified QTL-MTAs on chromosomes 1, 8, 11 and 12 (Table 5; Figs. 9, 10; Supplementary Fig. S6).Koizumi et al. (2021) report no agreement in regard to QTL locations and their genetic modes among the results of all previous investigations including that from their own study.However, apart from a chromosome 8 QTL, all other QTL observed here find good correspondence with the previous studies.The chromosome 1 QTL-MTA at ~ 64.5 Mb is in close vicinity to the QTL region (73.7 Mb to 75.3 Mb) on the same chromosome detected by Kaiser et al. (2020).Koizumi et al. ( 2021) also recognized effective SNPs in this region, but the significance level was below the set threshold possibly due to the small population size of their GWAS panel as postulated by them.The chromosome 11 QTL-MTA at ~ 40.2 Mb (chr11_40241089) reported here corroborates the chromosome 11 QTLs (35 cM and 41.2 cM) described by Braun et al. (2017) as inferred (1 cM = ~ 0.9 Mb) by equating the total genetic map length (822 cM) the latter study obtained to the physical size (741.6Mb) of the published potato genome (Pham et al. 2020).Similarly, the chromosome 12 QTL-MTA at ~ 11.2 Mb (chr12_11240374) detected here is in close proximity to the chromosome 12 MTAs (at ~ 9.5 Mb) reported by Yuan et al. (2020) with both studies reporting 'duplex dominance' gene action model for their respective MTAs.This is in disagreement to the observations made by Koizumi et al. (2021) that most 'common scab' QTLs reported in different studies don't correspond with each other.Furthermore, the validation of three QTL regions (chromosomes 1, 11 and 12) observed here with the previously described ones, suggests the robustness and higher transferability of these MTAs to a wider germplasm.The chromosome 8 QTL-MTA reported is novel to the current study and provides an additional breeding target to develop common scab resistant potato varieties.

Tuber shape uniformity
Tuber shape uniformity is an important trait in potato breeding which has significant impact on consumer acceptance and industrial processing.Studies (Bradshaw et al. 2008;Hara-Skrzypiec et al. 2018;Śliwka et al. 2008) have been conducted to unravel the genetic factors controlling this trait, with QTLs identified on several chromosomes (1, 2, 3, 4, 5, 8, and 11).Previously, this trait has been assessed in biparental mapping populations using the term 'regularity of tuber shape' (here after referred to as 'regularity') which includes many components such as depths of indentations, presence of various tuber defects and all other deviations from an 'ideal' tuber shape (Domański 2001).However, not all studies fully define the metrics used to score the regularity trait.There are no studies reported on tuber uniformity per se, however, as the regularity trait has the tuber shape uniformity as one of the major components, the results are discussed in the context of the previous studies.Hara-Skrzypiec et al. (2018) reported seven QTL for regularity trait on chromosomes 1, 3, 4, 5, and 8. On some of these previously reported chromosomes (1, 4 and 8) and also chromosome 12, the results obtained here revealed QTL peaks reaching close to the significance level, but the set stringency threshold was not achieved (Figs. 9, 10; Supplementary Fig. S6).GWAS here also detected significant tuber uniformity QTL localized in Ro locus on chromosome 10 (solcap_snp_c2_25510, chr10:49,532,389 bp, Table 5) which, to the best of our knowledge, is not reported previously.

Tuber skin brightness (texture)
Tuber skin texture is an important trait in potato because a smooth 'skin finish' is highly desirable for fresh-market varieties.The only published study (McCord et al. 2011) mapping this trait in potato detected multiple QTL (19 in total) spanning chromosomes 2-6, 9, 10 and 12. GWAS here identified significant QTL on chromosomes 2 and 6 (Table 5; Figs. 9, 10; Supplementary Fig. S6).Two of the five QTL-MTAs (at ~ 43.2 Mb, Table 5) on chromosome 2 detected here are in close vicinity to one of the three chromosome 2 QTLs identified by McCord et al. (2011) mapping to 70 cM (~ 49 Mb).The genetic to physical distance conversion is calculated as described above ('Atlantic' parent; total genetic map length: 1059.4 cM; cM-to-Mb conversion factor: 0.7; physical genome size: 741.6 Mb (Pham et al. 2020)).The chromosome 6 QTL-MTA and the remaining chromosome 2 QTL-MTAs were at different genomic locations than reported in McCord et al. (2011) and are novel to this study.GWAS here also revealed effective QTL peaks on chromosomes 1, 3, 4, 9 and 12 but remained marginally below the set significance threshold (Figs. 9, 10; Supplementary Fig. S6).Of these, QTL on chromosomes 3 and 12 show a good correspondence with those reported by McCord et al. (2011), while chromosome 1 QTL is absent, chromosome 4 QTL has only one marker (without any location details) and the chromosome 9 QTL is in a discordant location (than observed here) in the work presented by these authors.

Tuber flesh colour
Tuber flesh colour in potato is determined primarily by the carotenoid content, and its intensity varies from white to deep yellow, in the absence of other types of pigmented compound (e.g.anthocyanins).The yellow flesh locus (Y-locus) controlling the white-to-yellow flesh colour in potato maps to chromosome 3 (Bonierbale et al. 1988;Jacobs et al. 1995).The presence of a dominant allele 'Or' at Y-locus or in close vicinity determines orange flesh colour (Brown et al. 1993).GWAS conducted here revealed QTL on chromosomes 2, 3, 5, 7, 8, 10 and 11 (Table 5; Figs. 9, 10; Supplementary Fig. S6).The strongest association (chr03_44033492) was observed at 44.03 Mb on Chromosome 3, just 1.14 Mb away from β-carotene hydroxylase gene (Soltu.DM.03G018410, 42893526-42895894, DMv6.1 (Pham et al. 2020)) which is considered as the main regulator of flesh colour in potato (Kloosterman et al. 2010) and has been verified in a number of studies (D'hoop et al. 2008;Endelman and Jansky 2016;Hara-Skrzypiec et al. 2018).Besides the chromosome 3 locus, control of flesh colour by QTLs residing on several other chromosomes (1,2,4,6,7,9,10,12) is also reported (Campbell et al. 2014;D'hoop et al. 2008;Endelman and Jansky 2016;Hara-Skrzypiec et al. 2018;Li et al. 2021;Śliwka et al. 2008).Our analysis revealed two QTL-MTAs (chr02_32301311 and chr02_45195794) on chromosome 2 at 32.3 Mb and 45.2 Mb.The latter QTL-MTA (at 45.2 Mb) shows a good agreement with one of the QTL (chromosome 2, 48.52 Mb, DMv4.03 (Sharma et al. 2013)) for anthocyanidins reported by Parra-Galindo et al. (2019), and grossly corresponds with the QTL region on the bottom part of chromosome 2 (76.2-77.5 cM; 58.1-59.1 Mb; see above for cMto-Mb conversion) reported by Hara-Skrzypiec et al. (2018).Zhang et al. (2009), using non-sequence-tagged markers, identified QTL for tuber flesh colour on the top, top and midto-bottom parts of chromosomes 5, 8, and 9, respectively.While GWAS here did not reveal QTL on chromosome 9, the QTL-MTAs detected on chromosomes 5 (chr05_10586111) and 8 (chr08_4816673) appear to be coinciding with the QTL detected by Zhang et al. (2009), albeit after accounting for possible reverse genetic orientation for chromosome 8 reported in their study.D'hoop et al. (2008) reported a tuber flesh colour QTL on chromosome 7 which, though positional details are not divulged, supports the two QTL regions (chr07_53228342 and chr07_55218233) on chromosome 7 identified here.Li et al. (2021) found a major QTL for tuber flesh colour on chromosome 10 co-localizing with the I gene (StMYB88) reported to control tuber peel colour.These authors further revealed that StMYB88 and one other Myb family gene (StMYB89) located on chromosome 10 (51.7-52.3Mb, DMv4.03) were closely related to regulating anthocyanin biosynthesis of tuber flesh.Parra-Galindo et al. ( 2019) also detected several QTL (10 QTL at ~ 52 Mb and 1 QTL at ~ 57.3 Mb, DMv4.03 (Sharma et al. 2013)) for anthocyanidins content on chromosome 10 in the similar region.Furthermore, a QTL for 'Pigmented skin intensity' (Pf) on chromosome 10 (53.48-56.11 Mb, DMv4.03 (Sharma et al. 2013)) is reported by Endelman and Jansky (2016) supporting the genetic interactions controlling tuber flesh and skin colour stated by Li et al. (2021) as described above.GWAS here identified two QTL-MTAs on chromosome 10 (chr10_1154771 and chr10_53541706).Of these, the latter QTL-MTA at ~ 53.5 Mb coincides with the chromosome 10 QTL reported in above studies further supporting the importance of this region in regulating tuber flesh colour.The current study also detected a QTL-MTA on chromosome 11 (chr11_44481634) which finds good correspondence with a QTL (~ 40 Mb, DMv4.03 (Sharma et al. 2013)) for anthocyanidins content reported by Parra-Galindo et al. (2019) on the same chromosome.

Tuber sprouting (dormancy)
Tuber sprouting is one of the commercially important potato traits and critical for the long-term tuber storage required to ensure year-round availability as premature dormancy release and sprout growth in tubers during storage can result in a significant deterioration in product quality.Therefore, breeding potato cultivars with extended or otherwise modified dormancy or sprout vigour is a desirable goal (Alamar et al. 2017;Sharma et al. 2021).Tuber dormancy release and sprout growth are two different physiological phenomena and are often confounded in QTL studies, thereby contributing to the complex genetic architecture of this compound trait.Previous genetic studies (Freyre et al. 1994;Sharma et al. 2021;Simko et al. 1997;Simmonds 1964;van den Berg et al. 1996) on tuber dormancy have detected QTL effects on all chromosomes except chromosome 12.The analysis conducted here also revealed the presence of QTL on several (1, 2, 3, 4, 5, 6, 7, 9, 11 and 12) chromosomes (Table 5; Figs. 9, 10; Supplementary Fig. S6).Due to the very poor resolution of the genetic map in the previous studies, except by Sharma et al. (2021), and largely the deployment of non-sequence-tagged markers, detailed comparisons with the current study are not possible, however, many QTL reported here appear to be in approximately the same genomic locations as reported previously including the detection of novel QTL regions (Table 5).A much more recent study (Sharma et al. 2021) conducted with high map resolution reports tuber dormancy/sprouting QTL on chromosomes 1-8 and 10.Of these, chromosomes 8 and 10 QTL were not detected in the current study but all other QTL (i.e. on chromosomes 1-7) find higher correspondence with the GWAS QTL-MTAs reported here.None of the previous studies report QTL for this trait on chromosome 12 except Sharma et al. (2021) who suggest the presence of an effective QTL peak at the bottom end of this chromosome but the required statistical significance in their study was not achieved.The chromosome 12 QTL-MTA (Table 5) observed here strongly overlaps, and further strengthens, the chromosome 12 candidate QTL suggested by Sharma et al. (2021).Taken together QTL locations from current and previous studies reveal the involvement of all 12 potato chromosomes indicating a very complex genetic architecture for tuber dormancy/sprouting and that the trait is under the control of many genes acting at different loci.
van Eck et al. (2022) clearly demonstrate that use of RNAi 'knock downs' or overexpression of StOFP20 in cultivars with round or elongated tubers can significantly alter the shape of tubers towards the opposite extreme, suggesting that the StOFP20 gene is indeed responsible for controlling this trait in potato.

Tuber eye depth
Tuber eye depth is an important agronomic trait that affects the appearance and processing quality of tubers where potatoes with deep eyes contribute to significant peeling losses and are also not favoured by the consumers.To aid breeding potatoes with shallow eyes and uniform shape, several studies have been conducted in the past to identify genetic factors underlying this trait and identified a major locus controlling eye depth on chromosome 10 including minor effect QTLs on chromosomes 1, 2, 3, 4, 5, 6, 11, and 12 (Hara-Skrzypiec et al. 2018;Li et al. 2005;Lindqvist-Kreuze et al. 2015;Pandey et al. 2022;Prashar et al. 2014;Rosyara et al. 2016;Sharma et al. 2018;Śliwka et al. 2008;Totsky et al. 2020;Zhang et al. 2022;Zhao et al. 2023).GWAS here detected the most significant QTL for tuber eye depth on chromosome 10 corroborating the findings from the previous studies, and additional significant QTL-MTAs were revealed on chromosomes 3,4,6,7,9,10,11 and 12 (Table 5;Figs. 9,10; Supplementary Fig. S6).The chromosome 3 eye depth QTL reported by Prashar et al. (2014) is positioned at 61.87 Mb (solcap_snp_c2_37119, DMv4.03 (Sharma et al. 2013)) and that by Śliwka et al. (2008) is also located at the very bottom end of the same chromosome.The chromosome 3 eye depth QTL-MTA (chr03_60298153) observed here is in strong agreement to the QTL location mentioned in the above two studies.Hara-Skrzypiec et al. (2018) did not detect the major chromosome 10 QTL, however, the most important QTL for eye depth detected in their study was on chromosome 4 (0-33.6cM; 0-25.6 Mb; see above for cM-to-Mb conversion).The current study reports three QTL-MTAs on chromosome 4 (Table 5) and two (chr04_5763134 and chr04_10456693) of these support the chromosome 4 QTL reported by Hara-Skrzypiec et al. (2018).Zhao et al. (2023) found a QTL for eye depth on chromosome 6 (Chr06A2:25.9-26.4Mb (Wang et al. 2022)) but the same chromosome QTLs (chr06_40734919 and chr06_40735030, Table 5) detected here were ~ 14 Mb distance apart.Hara-Skrzypiec et al. (2018) identified another eye depth QTL towards the bottom part of chromosome 11 (44.8 cM; 34.2 Mb, DMv6.01; see above for cM-to-Mb conversion), however, all four QTL-MTAs detected in the current study were on the top part of the same chromosome (Table 5).Similarly, the chromosome 12 QTL observed by Lindqvist-Kreuze et al. (2015) in a diploid family of Andean potatoes ranged from 29.08 cM to 44.55 cM (QTL peak at 33.69 cM) while the QTL-MTA detected in the current study is located at the top end of chromosome 12 (chr12_1449252, Table 5).To the best of our knowledge, the chromosomes 7 and 9 QTL-MTAs are novel and not reported previously.

Total tubers per plant and tubers per stem
Tuber number is an economically important trait that affects marketable yield and suitability for specific markets.The trait is influenced by many agronomic and environmental factors but tuber number is somewhat genotype dependent (Celis-Gamboa et al. 2003).Previous genetic mapping studies on tuber number have identified QTLs on chromosomes 2, 4, 5, 7, 8 and 10 using biparental (Manrique-Carpintero et al. 2015, 2018;Rak et al. 2017) and association mapping (Schönhals et al. 2016;Zhang et al. 2022) approaches.The QTL-MTAs for tuber number detected in this study map to chromosomes 1, 4 and 9 (chr01_66016265, chr04_63755427 and chr09_50878917; Table 5; Figs. 9, 10; Supplementary Fig. S6).Out of the four QTLs on chromosome 4 reported by Manrique-Carpintero et al. (2018), one is located at the bottom end of the chromosome (143 cM equalling ~ 81.6 Mb; total genetic map length 1299.1 cM; cM-to-Mb conversion procedure as described above) but not so close to the QTL region observed here.Another study (Rak et al. 2017) reporting QTL on chromosome 4 for this trait finds the QTL peak at 81 cM (equalling ~ 57.7 Mb; genetic map length 1041.4 cM; cM-to-Mb conversion procedure as described above) with the closest sequence-based marker (solcap_ snp_c2_39439) positioned at 63.55 Mb (DMv4.03(Sharma et al. 2013)) matching very closely with the chromosome 4 QTL-MTA detected here.Among all other QTLs, the overlapping of chromosome 4 QTL across multiple studies suggests the importance of this locus in regulating tuber number.However, Rak et al. (2017) only observed the chromosome 4 QTL in their biparental mapping analysis and not GWAS conducted as part of the same study citing lack of power for the latter analysis.The current study overcomes this barrier providing a sequence-based QTL-MTA while extending the robustness and applicability of this potentially important chromosome 4 QTL locus in a wider germplasm.In addition to tubers per plant, the current study also investigated GWAS for tubers per stem.Three significant QTL-MTAs on chromosomes 5 and 6 were observed (Table 5; Figs. 9, 10; Supplementary Fig. S6).No previous study reports the genetic analysis for this trait.The QTL for 'tubers per stem' and chromosome 1 and 9 QTL for 'tubers per plant' traits reported here are novel to the current investigation and provide additional breeding targets for manipulating potato tuber number and yield.

Average tuber weight
Tuber weight is one of the main constituents of tuber yield, and together with total tuber number governs the overall productivity of the potato crop.Previous studies report genetic architecture for this trait spanning QTLs on various chromosomes viz., 1, 2, 4, 5, 6, 7 and 10 (Braun et al. 2017;Hara-Skrzypiec et al. 2018;Manrique-Carpintero et al. 2018).GWAS here revealed significant QTL-MTAs for average tuber weight on chromosomes 1 and 2 (Table 5; Figs. 9, 10; Supplementary Fig. S6).Braun et al. (2017) identified QTLs for average tuber weight on chromosome 1 spanning 71.29-77.19Mb genomic interval (DMv4.03(Sharma et al. 2013)).Hara-Skrzypiec et al. (2018) identified QTLs for this trait on chromosomes 1, 4, 5 and 6.The genomic interval for their chromosome 1 QTL ranged from ~ 27.6 to 70 Mb (36.2-91.8cM; total genetic map length 972; cM-to-Mb conversion procedure as described above).The chromosome 1 QTL-MTA (chr01_66991241, Table 5) revealed here lies in the genomic interval reported by this previous study, and is in gross agreement with the genetic locus identified by Braun et al. (2017).Another study (Manrique-Carpintero et al. 2018) reports QTLs on chromosomes 2, 4, 7 and 10 for this trait and the location of chromosome 2 QTLs (~ 53.1 Mb and ~ 59.9 Mb; 93 cM and 105 cM; total genetic map length 1299.1 cM; cM-to-Mb conversion procedure as described above) described by these authors broadly agree to the chromosome 2 QTL-MTA (chr02_40212654, Table 5) detected here.

Tuber yield
Tuber yield is the most important target trait for potato crop improvement and remains a prime focus for potato research and breeding.Tuber yield has a complex genetic architecture posing great challenges for breeding higher yielding potato varieties.During the last three decades of genetic mapping using various types of DNA-based markers applied over a number of diploid and tetraploid potato populations, exploiting linkage and association mapping approaches, several QTLs for tuber yield have been reported on chromosomes 1, 2, 4, 5, 6, 7 and 12 (Bonierbale et al. 1993;Bradshaw et al. 2008;Chen et al. 2021;Freyre and Douches 1994;Kreike et al. 1996;Manrique-Carpintero et al. 2015;Marand et al. 2019;McCord et al. 2011;Pereira et al. 2021b;Rosyara et al. 2016;Schafer-Pregl et al. 1998).Schonhals et al. (2017), in their analyses using 90 tetraploid genotypes involving contrasting phenotypes for tuber yield, tuber starch content and tuber starch yield, genotyped using RAD (Restriction-site Associated DNA) sequencing and Infinium 8k Potato SNP Array (Felcher et al. 2012;Hamilton et al. 2011), predicted that at least 200 QTL regions with one or more underlying genes on all chromosomes control the natural variation for these traits.Tuber yield GWAS conducted in the current investigation revealed QTL-MTAs distributed over chromosomes 1, 2, 3, 4, 6, 8 and 12 (Table 5; Figs. 9, 10; Supplementary Fig. S6) overlapping several of the QTL chromosomes mentioned above.

Height breadth ratio (HB ratio)
Plant height and breadth are important agronomic traits, affecting architecture and above ground biomass (foliage), the modification of which can improve general plant vigour, yield, ability of stress adaptation and several crop management aspects.The current study performed the genetic analysis on potato canopy architecture (using HB ratio as a proxy) assessed in a tetraploid diversity panel.Significant QTL regions were identified on chromosomes 3, 5, 10 and 12 (Table 5; Figs. 9, 10; Supplementary Fig. S6).The chromosome 5 QTL-MTA identified here (chr05_4944682) is in close vicinity to St-CDF1 gene (cycling DOF factor, Soltu.DM.05G005140, chr05:4485531-4488495, DMv6.1) known to control plant maturity in potato (Kloosterman et al. 2013).According to Bradshaw et al. (2004) a single locus, at the top end of chromosome 5, with allelic variation for a physiological trait is assumed to affect potato plant height and maturity and indirectly control resistance/susceptibility to blight.The closeness of chromosome 5 QTL to the well described maturity locus corroborates the above assumption associating plant height with maturity, and the additional QTL detected here potentially correspond to further loci regulating these traits.

Tuber dry matter
Tuber dry matter, specific gravity, and starch content are highly correlated traits (Houghland 1966;LeClerg 1947), and co-localization of QTLs for these traits is expected (McCord et al. 2011).Considering this the interpretation of QTL results from the studies on these three traits is used interchangeably.Tuber dry matter content, estimated by specific gravity primarily reflecting starch content, is an important agronomic trait for industrial and food processing impacting quality and consistency of the final processed products such as potato chips and French fries.The trait is influenced by genetic and environmental factors including genotype-by-environment interactions (Ruttencutter et al. 1979).Various studies (Bradshaw et al. 2008;Freyre and Douches 1994;Freyre et al. 1994;Li et al. 2019Li et al. , 2008;;Manrique-Carpintero et al. 2015;McCord et al. 2011;Park et al. 2021;Pereira et al. 2021b;Schafer-Pregl et al. 1998;Schönhals et al. 2016) on specific gravity conducted using diploid and tetraploid populations reveal the presence of QTLs for this trait on all 12 potato chromosomes.GWAS for tuber dry matter here detected two QTL-MTAs (chr10_58099346 and chr12_57450585; Table 5; Figs. 9, 10; Supplementary Fig. S6) on chromosomes 10 and 12 which find correspondence with previous studies (Li et al. 2008;Schafer-Pregl et al. 1998).The lack of sequence-based markers and limited resolution of the QTL positions in these two studies prevent a direct comparison with the associations detected in this investigation; however, the position of the chromosome 10 and 12 QTLs reported by these authors (for Li et al. (2008) chromosome 10 QTL only) finds good correspondence with the QTL-MTAs revealed here.Schönhals et al. (2016) also report chromosome 10 MTAs for tuber starch content in the genomic region spanning 50.94-55.85Mb (DMv4.03(Sharma et al. 2013)) which is in close vicinity to the chromosome 10 QTL-MTA (chr10_58099346) detected in the current study; however, the chromosome 12 MTAs reported in the two studies are at the opposite ends of the chromosome.
FSH and ACB show a significant correlation (− 0.26, Fig. 8) and the association of their QTL-MTAs on four distinct chromosomes is intriguing.Similarly, the co-existence of QTL-MTAs for ATW and TTU on chromosome 1 in the region where two  for YLD are also detected (Table 5).ATW displays significant correlations (Fig. 8) with TTU (− 0.68) and YLD (+ 0.63) and this QTL hotspot region on chromosome 1 potentially indicates an important target region for manipulating tuber yield.As we describe in the Phenotyping section, the strong correlation (− 0.42) between ACB (scoring, 1 severe) and DRM is interesting and the co-localization of their QTL-MTAs on chromosome 10 further indicates the interaction of genetic factors controlling these two important traits.DRM also shows significant correlations (Fig. 8) with SHP (− 0.24; scoring 1-6, 1 round) and EYE (− 0.27; scoring 1 deep, 9 shallow), and interestingly the significant QTL-MTAs for these two traits lie within 8 Mb distance of DRM QTL-MTA on chromosome 10.SHP and EYE showed a strong correlation (0.47) as stated above and the two traits also displayed co-localization of QTL-MTAs on chromosome 10.This has been well reported in previous studies (Lindqvist-Kreuze et al. 2015;Rosyara et al. 2016) and was further validated here.For one of the MTAs on chromosome 10, these two traits (SHP and EYE) also shared QTL region with UNI which displays a strong correlation with SHP (− 0.42) and the co-localization of MTAs for these three traits is not reported previously to the best of our knowledge.Overall, the QTL hotspots reported here add interesting insights governing these important traits and provide novel targets for potato breeding and research.

Conclusion
The current study reports the application of genotyping-bysequencing, a flexible and 'open' genotyping approach, in the genetic analysis of cultivated potato using a large and diverse association panel.Moreover, results from use of GBS are compared with the outputs from using a 'fixed' SNP genotyping platform.The GBS libraries, after the evaluation of a number of restriction enzyme combinations, were constructed using PstI/MseI restriction enzyme pair yielding SNPs largely confined (~ 90%) to genic or geneassociated region of the genome validating the effectiveness of selecting a methylation-sensitive restriction enzyme (PstI) for constructing GBS libraries.
Genetic analyses conducted using robust sets of GBS and Infinium SNPs did not detect clear separation of genotypes into distinct subpopulations as also reported previously for cultivated potato.Nevertheless, the analysis using both SNP sets indicated the presence of three subpopulations within the association panel and, between the two approaches, GBS SNPs were able to discriminate the three subgroups more effectively.The LD decay rates, assessed at the whole chromosome level, obtained here are comparable to those reported previously in potato, however, the estimates using GBS SNPs were significantly sharper than those displayed by the Infinium SNPs.The difference was more pronounced for chromosome 8 for which some of the previous studies as well as the current Infinium-based analyses revealed extremely conserved levels of LD decay as compared to other chromosomes, however, the chromosome 8 LD decay estimate using GBS SNPs was in good agreement with the levels observed for other chromosomes.
The germplasm panel displayed a broad range of phenotypic diversity for most traits analysed here with a wide heritability range and significant correlations between several trait pairs.GWAS were performed using a combined (GBS and Infinium) set of 30,363 SNPs yielding 1181 significant MTAs covering all traits included in the study.These were further converted to 189 unique QTL-MTAs following LDbased filtering and retaining MTAs with highest significant p value per LD window interval; and also keeping only the most significant MTA for SNPs appearing with multiple gene action models.The majority of QTL-MTAs obtained were from GBS SNPs highlighting their proximity to the causal variation and increased marker saturation and resolution achieved through GBS.This potentially also illustrates the utility of marker-dense de novo genotyping platforms in overcoming ascertainment bias and providing a more accurate correction for different levels of relatedness in GWAS models.
The study provides robust and LD-based filtered set of most significant QTL-MTAs for sixteen important production, agronomic and processing traits in potato.It is worth noting that most previous studies for these traits, besides a lack of any previous study for some of the traits, were performed on diploid biparental populations.The current study, however, deployed a large tetraploid varietal panel representing a larger diversity of the cultivated potato gene pool suggesting that the loci discovered here, especially those in common between the previous studies, may be the main ones controlling these highly important traits especially in the context of European potato germplasm.Moreover, the previous studies largely used low-throughput nonsequenced-tagged markers with a few employing fixed SNP platforms, whereas the current study exploited genome-wide genotyping and de novo SNP discovery simultaneously; and the reported sequence-tagged MTAs are likely to have higher transferability across a wider germplasm and increased utility for expediting genomics-assisted breeding and developing diagnostic markers for the studied traits.
GWAS here detected QTL hotspots for several traits spanning all 12 chromosomes covering known, such as for tuber shape and eye depth, as well as novel regions.Some of these traits also show significant correlations, and further efforts will be needed to unravel whether these trait QTL hotspots are governed by pleiotropic effects or independently acting but closely linked loci, or a combination of both.The observations further highlight the challenges in identifying functional genes underlying these QTL if the trait is studied in isolation of the other affected traits, notwithstanding the interplay of multiple mechanisms involved in trait inheritance.Overall, the genetic architecture for the studied traits and the QTL hotspots reported here illustrate the challenges faced in understanding and breeding for complex traits, but also provide a sound foundation for overcoming the hurdles encountered in improving such traits.
GWAS can be performed for a greater number of traits in a single study as opposed to developing bespoke populations for every single trait in traditional biparental mapping.The latter requires more resources, is time consuming and only displays diversity present in the two parents rather than a larger allele repertoire scanned through GWAS.Sequencebased genotyping performed here, exploiting methylationsensitive-GBS, targeted largely the euchromatic part of the genome with the increased ability, in a cost-effective manner, to saturate any population of interest with thousands of markers.The main finding reveals the effectiveness of flexible sequence-based genotyping over fixed SNP platform for identifying significant marker-trait associations with higher precision and resolution as well as detection of additional QTL.
assigned an average of 103.1 million net usable reads per GBS library (~ 4.3 million per sample) displaying a range of 82.1 (GBS Lib 07) to 131.5 (GBS Lib 09) million reads per library (Supplementary Table

Fig. 1
Fig. 1 Illustration of genomic DNA fragment size distribution from: (a) test GBS libraries obtained using six restriction enzyme pairs, (b) in silico restriction digestion prediction using six restriction enzyme pairs and, (c) representative 'Illumina sequencing ready' 24-plex GBS library constructed using PstI/MseI

Fig. 3
Fig. 3 Distribution of (a) GBS and (b) Infinium array SNPs used in all genetic analyses.Bin size = 1 Mb

Fig. 4
Fig. 4 Principal component analysis of 290 tetraploid potato GWAS panel clones using (a) 24,518 GBS SNPs and (b) 5845 Infinium array SNPs.The individual clones are coloured on the basis of their

Fig. 5
Fig. 5 Heatmap displaying relationships among 290 tetraploid potato GWAS panel clones using (a) 24,518 GBS SNPs and (b) 5845 Infinium array SNPs.The red diagonal represents perfect relationship of each line with itself; the symmetric off-diagonal elements represent relationship for pairs of lines where warmer colour indicates a positive relationship and colder colour denotes a negative relationship.The colours in the horizontal bar below dendrogram depict clones' membership to subpopulations (clusters) as detailed in Supplementary TableS7(GBS: Cluster1 = red, Cluster2 = green, Cluster3 = cyan; Infinium: Cluster1 = green, Cluster2 = red, Cluster3 = cyan)

Fig. 6
Fig. 6 Histogram plots for the diagonal and off-diagonal values of the genomic relationship matrix using GBS (24,518 SNPs; a, diagonal values; b, off-diagonal values) and Infinium array (5845 SNPs; c, diagonal values; d, off-diagonal values) datasets

Fig. 8
Fig. 8 Correlation heatmap of all 16 traits included in the study

Fig. 10
Fig. 10 Graphical illustration of the non-redundant set of unique QTL-MTAs (a, K models; b, QK models) listed in Table 5. Black dashed line depicts GWAS main significance threshold set to 1e-5 − log10(p) value, while actual significance thresholds obtained

Table 2
Details of GBS SNP effects by type and region a Percent SNPs out of the total SNP count for all 'effect type' categories b Percent SNPs out of the total SNP count for all 'effect region' categories

Table 3
Extent of LD decay in the GWAS panel estimated using GBS and Infinium array SNPs at whole chromosome level r 2 max,90 : maximum Pearson correlation coefficient (r 2 ) achieved in the 90th percentile LD 1/2max,90 : physical distance (Mb) at which LD has decayed to half its maximum r 2 value in the 90th percentile LD 1/10,90 : physical distance (Mb) at which LD has decayed to r 2 = 1/10 in the 90th percentile (Stich et al. 2013sSharma et al. 2018;Simko et al. 2006;Vos et al. 2017 revealed enhanced level of discrimination among the three subpopulation groups by using GBS SNPs than that achieved by the Infinium SNPs.Both SNP sets were also utilized to assess LD rates in the potato panel and the LD decay estimates obtained here were found to be comparable to those previously reported(Dhoop et al. 2010;Sharma et al. 2018;Simko et al. 2006;Vos et al. 2017) and contrast the only study(Stich et al. 2013) reporting very sharp LD decay in potato.As compared to the rest of the chromosomes, chromosome 8 has been reported to display extremely conserved levels of LD 1/10,90 , for example, 8 Mb (Berdugo-Cely et al. 2017) and 20.04 Mb(Sharma et al.

Table 5
List of unique QTL-marker-trait associations (QTL-MTAs) following LD-based filtering and retaining MTAs with most significant p value per LD window interval.For QTL-MTA SNPs appearing with multiple gene action models, only the SNP and gene action model combination giving the most significant p value was kept